ABSTRACT

We use the Millennium Simulation, a 10 billion particle simulation of the growth of 
cosmic structure, to construct a new model of galaxy clustering. We adopt a method- 
ology that falls midway between the traditional semi-analytic approach and the halo 
occupation distribution (HOD) approach. In our model, we adopt the positions and ve- 
locities of the galaxies that are predicted by following the orbits and merging histories 
of the substructures in the simulation. Rather than using star formation and feedback 
'recipes' to specify the physical properties of the galaxies, we adopt parametrized
functions to relate these properties to the quantity $M_{\rm infall}$, defined as the mass of the halo
at the epoch when the galaxy was last the central dominant object in its own halo. We 
test whether these parametrized relations allow us to recover the basic statistical prop- 
erties of galaxies in the semi-analytic catalogues, including the luminosity function, 
the stellar mass function and the shape and amplitude of the two-point correlation 
function evaluated in different stellar mass and luminosity ranges. We then use our 
model to interpret recent measurements of these quantities from Sloan Digital Sky 
Survey data. We derive relations between the luminosities and the stellar masses of 
galaxies in the local Universe and their host halo masses. Our results are in excellent 
agreement with recent determinations of these relations by Mandelbaum et al. using
galaxy-galaxy weak lensing measurements from the SDSS. 

Key words: galaxies: fundamental parameters - galaxies: haloes - galaxies: distances 
and redshifts - cosmology: theory - cosmology: dark matter - cosmology: large-scale 
structure 



1 INTRODUCTION 

According to the current standard paradigm, galaxies form 
and reside inside extended dark matter haloes. Three dif- 
ferent approaches have been used to model the link be- 
tween the properties of galaxies and the dark matter 
haloes in which they are found. One approach is to carry 
out N-body + hydrodynamical simulations that include
both gas and dark matter (Katz et al. 1992; Pearce et al.
2001; White et al. 2001; Yoshikawa et al. 2001). Another
approach is to combine N-body simulations with simple
prescriptions, taken directly from semi-analytic models of
galaxy formation (Kauffmann et al. 1999), to track gas cooling
and star formation in galaxies. The third method is the
so-called Halo Occupation Distribution (HOD) approach,
which aims to provide a purely statistical description of how
dark matter haloes are populated by galaxies. 

Typical HOD models are constructed by specifying the 
number of galaxies N that populate a dark matter halo 
of mass M as well as the distribution of galaxies within
these haloes (Kauffmann et al. 1997; Peacock & Smith 2000;
Seljak 2000; Benson et al. 2000; Berlind & Weinberg 2002;
Berlind et al. 2003). More recent models have concentrated
on the so-called conditional luminosity function $\Phi(L|M)\,dL$,
which gives the number of galaxies of luminosity L that
reside in a halo of mass M (Yang et al. 2003). Most HOD
models also distinguish between "central" galaxies, located 
at the centres of dark matter haloes and "satellite" galax- 
ies, which are usually assumed to have the same density 
profile as the dark matter within the halo. Physically, this 
is supposed to reflect the fact that gas cools and accu- 
mulates at the halo centres until the halo merges with a 
larger structure. With this approach, the models can be 
used to explore the parameters that are required to match
simultaneously the galaxy luminosity function as well as the
luminosity, colour and morphology dependences of the
correlation function (van den Bosch et al. 2003; Zehavi et al.
2005; Yang et al. 2005). Other papers have used HOD models
to explore the detailed shape of the two-point correlation
function (Zehavi et al. 2004) as well as higher order correlation
functions (Wang et al. 2004).

N-body simulations can now be carried out with 
high enough resolution to track the histories of individual
substructures (subhaloes) within the surrounding
halo (Springel et al. 2001). It thus becomes possible to
specify the positions and velocities of galaxies within a halo 
in a dynamically consistent way, rather than assuming a 
profile or form for the velocity distribution. Galaxy cluster- 
ing statistics that are computed using the full information 
available from these high resolution simulations should in 
principle be more accurate and robust. 

Caution must be exercised, however, when only using 
subhaloes as tracers of galaxies in high resolution simulations,
as has been recently done by Vale & Ostriker (2004,
2005) and Conroy et al. (2005). In standard models of galaxy
formation, when a galaxy is accreted by a larger system 
such as a cluster, its surrounding gas is shock-heated to high 
temperatures. Star formation then terminates as the inter- 
nal gas supply of the galaxy is used up. The stellar masses of 
satellite galaxies only change by a small amount after they 
are accreted, while their luminosities dim due to aging of 
their stars. In contrast, the dark matter haloes surrounding 
the satellites gradually lose mass as their outer regions are
tidally stripped (De Lucia et al. 2004). Near the centres of
the haloes, most of the substructures have been completely
destroyed. Gao et al. (2004) have shown that the radial distribution
of subhaloes is much less centrally concentrated
than the radial distribution of galaxies predicted by simu- 
lations that follow the full orbital and merging histories of 
these systems. 

In this paper, we make use of the Millennium Simula- 
tion, a 10 billion particle simulation of the growth of cosmic 
structure, to construct a new model of galaxy clustering. We 
adopt a methodology that falls in between the semi-analytic 
approach, which tracks galaxy formation 'ab initio' within 
the simulation, and the HOD approach, which only provides 
a statistical description of how galaxies are related to the un- 
derlying dark matter density distribution. In our approach, 
we adopt the positions and velocities of the galaxies as pre- 
dicted by following the orbits and merging histories of the 
substructures in the simulation. Rather than using star for- 
mation and feedback 'recipes' to calculate how the physical 
properties of the galaxies such as their luminosities or stellar 
masses evolve with time, we adopt parametrized functions 
to relate these properties to the quantity $M_{\rm infall}$, defined
as the mass of the halo at the epoch when the galaxy was 
last the central dominant object. For central galaxies at the 
present day, $M_{\rm infall}$ is simply the present day halo mass,

1 Note that the simulations analyzed by Conroy et al. (2005) have
significantly higher resolution than the ones analyzed in this paper,
but are much smaller in volume. As discussed in their paper,
the problem of disrupted subhaloes is not likely to be a signifi- 
cant problem for galaxies in the range of luminosities considered 
in their analysis. 



but for satellite galaxies, it is the mass of the halo when the 
galaxy was first accreted by a larger structure. 

This approach has the advantage of the semi-analytic 
models in that it provides very accurate positions and ve- 
locities for all the galaxies in the simulation. It maintains 
the simplicity of the HOD approach, because it bypasses 
the need to incorporate detailed treatment of star formation 
and feedback processes. Our aim in developing these mod- 
els is to use them as a means of constraining the relation 
between galaxy physical properties and halo mass directly 
from observational data, not as a means of understanding 
the physics of galaxy formation. 

The paper is organized as follows: we first introduce
the Millennium Run and the methodology used for identifying
haloes, subhaloes and galaxies in this simulation. We
then study the relation between luminosity/stellar mass and
$M_{\rm infall}$ in mock galaxy catalogues constructed using these
simulations. In Sec. 4 we introduce a parametrization for
these relations and show that we are able to recover basic
statistical quantities such as the galaxy luminosity/mass
function and the shape and amplitude of the two-point correlation
function in different luminosity/mass bins. We also
investigate the effect of changing the parameters of the
relation on the luminosity function and correlation function.
In Sec. 5 we apply the method to real data on the
clustering of galaxies as a function of luminosity and stellar
mass (Li et al. 2006) derived from the Sloan Digital Sky
Survey (SDSS). Finally, we discuss our results and present
our conclusions.



2 THE SIMULATION 

The Millennium Simulation (Springel et al. 2005), used in
this study, is the largest simulation of cosmic structure
growth carried out so far. The cosmological parameter values
in the simulation are consistent with recent determinations
from a combined analysis of the 2dFGRS (Colless et al.
2001) and first year WMAP data (Spergel et al. 2003). A flat
$\Lambda$CDM cosmology is assumed with $\Omega_m = 0.25$, $\Omega_b = 0.045$,
$h = 0.73$, $\Omega_\Lambda = 0.75$, $n = 1$, and $\sigma_8 = 0.9$. The simulation
follows $N = 2160^3$ particles of mass $8.6 \times 10^{8}\,h^{-1}M_\odot$ from
redshift $z = 127$ to the present day, within a comoving box
of $500\,h^{-1}$ Mpc on a side.
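
As a rough consistency check on the numbers quoted above, the particle mass can be recovered from the box size, the particle number and the matter density parameter. The sketch below is illustrative only: the function name is ours, and the critical density of $2.775 \times 10^{11}\,h^{2}M_\odot\,{\rm Mpc}^{-3}$ is the standard value, not a number taken from the simulation papers.

```python
# Critical density today in units of h^2 M_sun / Mpc^3 (standard value,
# assumed here; it is not quoted in the text).
RHO_CRIT = 2.775e11


def particle_mass(omega_m, box_size, n_per_side):
    """Mass per particle in h^-1 M_sun.

    box_size is the comoving box side in h^-1 Mpc and n_per_side is the
    cube root of the total particle number.
    """
    total_mass = omega_m * RHO_CRIT * box_size ** 3
    return total_mass / n_per_side ** 3


m_p = particle_mass(omega_m=0.25, box_size=500.0, n_per_side=2160)
print(f"particle mass ~ {m_p:.2e} h^-1 M_sun")
```

The result should land close to the $8.6 \times 10^{8}\,h^{-1}M_\odot$ quoted for the simulation.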

Full particle data are stored at 64 output times. For 
each output, haloes are identified using a friends-of-friends
(FOF) group-finder. Substructures (or subhaloes) within a
FOF halo are located using the SUBFIND algorithm of
Springel et al. (2001). After finding all haloes and subhaloes
at all output snapshots, merging trees are built describing 
in detail how these systems merge and grow as the universe 
evolves. Since structures merge hierarchically in CDM uni- 
verses, for any given halo, there can be several progenitors, 
but in general each halo or subhalo only has one descendant. 
Merger trees are thus constructed by defining a unique de- 
scendant for each halo and subhalo. Through those merging 
trees, we are able to follow the history of haloes/subhaloes, 
as well as the galaxies inside them. 

Once a halo appears in the simulation, it is assumed 
that a galaxy begins to form within it. As the simulation 
evolves, the halo may merge with a larger structure and be- 
come a subhalo, while the galaxy becomes a satellite galaxy. 
Figure 1. Relations between stellar mass, baryonic mass and $M_{\rm infall}$ calculated from the semi-analytic galaxy catalogues. Open circles
represent central galaxies and triangles are for satellite galaxies. Error bars indicate the 95 percentiles of the mass distribution at the
given value of $M_{\rm infall}$. Dashed lines show the double power law parametrized fit to the median value of the relations for the galaxy sample
as a whole.



The galaxy's position and velocity are specified by the po- 
sition and velocity of the most bound particle of its host 
halo/subhalo. Even if the subhalo hosting the galaxy is 
tidally disrupted, the position and velocity of the galaxy are
still traced through this most bound particle. We will refer 
to these galaxies without subhaloes as "orphaned" systems. 
Galaxies thus only disappear from the simulation if they 
merge with another galaxy. The time taken for an orphaned 
galaxy to merge with the central object is given by the time 
taken for dynamical friction to erode its orbit, causing it 
to spiral into the centre and merge. The satellite orbits are 
thus tracked directly until the subhalo is disrupted; there- 
after, the time taken for the galaxy to reach the centre is 
calculated using the standard Chandrasekhar formula. 
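
The text does not spell out which form of the Chandrasekhar formula is used. As an illustration only, the sketch below adopts one commonly quoted approximation, $t_{\rm fric} = 1.17\,V_{\rm vir}\,r_{\rm sat}^2 / (G\,m_{\rm sat}\ln\Lambda)$ with $\ln\Lambda = \ln(1 + M_{\rm vir}/m_{\rm sat})$. The function name, the unit choices and the example values are our assumptions, not details taken from the simulation catalogues.

```python
import math

G_KPC = 4.30091e-6           # G in kpc (km/s)^2 / M_sun
KPC_PER_KMS_IN_GYR = 0.9778  # 1 kpc / (1 km/s) expressed in Gyr


def friction_time_gyr(v_vir, r_sat, m_sat, m_vir):
    """Approximate dynamical friction merging time in Gyr.

    v_vir : virial velocity of the host halo [km/s]
    r_sat : orbital radius of the satellite when its subhalo is lost [kpc]
    m_sat : satellite mass [M_sun]
    m_vir : host halo virial mass [M_sun]
    """
    # Assumed form of the Coulomb logarithm.
    ln_lambda = math.log(1.0 + m_vir / m_sat)
    t = 1.17 * v_vir * r_sat ** 2 / (G_KPC * m_sat * ln_lambda)
    return t * KPC_PER_KMS_IN_GYR


# Hypothetical example: a 1e11 M_sun satellite at 200 kpc in a 1e13 M_sun host.
t_example = friction_time_gyr(v_vir=200.0, r_sat=200.0, m_sat=1e11, m_vir=1e13)
print(f"merging time ~ {t_example:.1f} Gyr")
```

In this approximation more massive satellites sink faster, so an orphaned galaxy with a small mass relative to its host can survive for a large fraction of a Hubble time.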

In this paper, we will parameterize quantities such as 
galaxy luminosity and stellar mass as a function of the quantity
$M_{\rm infall}$, which is defined as the virial mass of the halo
hosting the galaxy at the epoch when it was last the central
galaxy of its own halo. The Millennium simulation catalogues
include haloes down to a resolution limit of 20 particles,
which yields a minimum halo mass of $2 \times 10^{10}\,h^{-1}M_\odot$.
In our study, we only consider galaxies with $M_{\rm infall}$ greater
than $10^{10.5}\,h^{-1}M_\odot$. (Note that $M_{\rm infall}$ is simply the virial
mass of the host halo for central galaxies at the present day.) 
This results in a total sample of 11761178 galaxies within 
the simulation volume. 
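
The definition of $M_{\rm infall}$ can be sketched as a walk back along a galaxy's main branch in the merger tree. The data layout below (a list of per-snapshot records with a central flag and a host virial mass) is hypothetical and is not the format of the Millennium catalogues; it only illustrates the logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BranchNode:
    """One snapshot along a galaxy's main branch (hypothetical layout)."""
    snapshot: int        # output number, increasing with time
    is_central: bool     # True if the galaxy is the central object of its halo
    virial_mass: float   # virial mass of the host halo in h^-1 M_sun


M_INFALL_MIN = 10 ** 10.5  # sample cut quoted in the text, in h^-1 M_sun


def infall_mass(branch: List[BranchNode]) -> Optional[float]:
    """Host virial mass at the latest snapshot where the galaxy was central."""
    for node in sorted(branch, key=lambda n: n.snapshot, reverse=True):
        if node.is_central:
            return node.virial_mass
    return None  # the galaxy was never a central in the stored outputs


def in_sample(branch: List[BranchNode]) -> bool:
    """Apply the minimum M_infall cut used for the galaxy sample."""
    m = infall_mass(branch)
    return m is not None and m > M_INFALL_MIN
```

For a present-day central the loop returns at the first node, so $M_{\rm infall}$ reduces to the current virial mass, as noted above.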



3 THE RELATIONS BETWEEN $M_{\rm infall}$,
STELLAR MASS AND LUMINOSITY IN 
THE SEMI-ANALYTIC GALAXY 
CATALOGUES 

In the following two sections we use the semi-analytic
galaxy catalogues constructed from the
Millennium simulation by Croton et al. (2006)
(http://www.mpa-garching.mpg.de/galform/agnpaper/) to
study how galaxy properties such as stellar mass, baryonic
mass (i.e. stellar mass + cold gas mass) and luminosity
depend on $M_{\rm infall}$, the mass of the halo in which the
galaxy was last a central object. We construct parametrized
relations between these quantities and $M_{\rm infall}$ that match
the relations found in the mock catalogue. We then show
that our parametrization allows us to recover both the
luminosity/mass functions of the simulated galaxies and
the shape, amplitude and mass/luminosity dependence of
the two-point correlation functions. Croton et al. (2006)
have shown that their catalogues provide a good match to
the observed galaxy luminosity function and the clustering
properties of galaxies, so we believe that it is reasonable
to use these catalogues as a way of motivating and testing
our simple parametrizations.

In Fig. 1 we plot the relations between $M_{\rm infall}$ and
galaxy stellar mass ($M_{\rm stars}$) and baryonic mass ($M_{\rm baryon}$).
We show results for present-day central galaxies in blue and
satellite galaxies in red. Error bars indicate the 95th percentiles
of the distributions. As we will show, the relations
between $M_{\rm infall}$ and $M_{\rm stars}$/$M_{\rm baryon}$ are well described
by a double power law. The crossover point between the
two power laws is at a halo mass of $\sim 3 \times 10^{11}\,h^{-1}M_\odot$,
which corresponds to a galaxy with stellar mass of around
$10^{10}\,h^{-1}M_\odot$. In less massive haloes, supernova feedback acts
to prevent gas from cooling and forming stars as efficiently
as in high mass haloes. In massive haloes, the cooling times
become longer and a smaller fraction of the baryons are predicted
to cool and form stars. In addition, in the models of
Croton et al. (2006), heating from AGN also acts to suppress
cooling onto high mass galaxies.

Fig. 2 shows that the distribution of $M_{\rm stars}$ at a
given value of $M_{\rm infall}$ is well described by a lognormal
function. The width of the lognormal depends weakly on 
Figure 2. The distribution in stellar mass for different $M_{\rm infall}$
bins increasing from $10^{10.5}\,h^{-1}M_\odot$ (left) to $10^{14}\,h^{-1}M_\odot$ (right).
Solid and dashed lines are for central and satellite galaxies (note
that they lie on top of each other for the three lower mass bins).
Dotted lines indicate the Gaussian fits to the distributions that
are used in our parametrized model.

Figure 3. The same as in Fig. 1, but luminosity (represented
by the $b_J$-band magnitude) is plotted as a function of $M_{\rm infall}$.
Dashed (dotted) lines show the double power law fits to the relation
for central (satellite) galaxies.



halo mass with a maximum dispersion $\sigma$ of 0.2 dex at
$M_{\rm infall} = 10^{10.5}\,h^{-1}M_\odot$ and a minimum $\sigma$ of 0.1 dex at
$M_{\rm infall} = 10^{11.5}\,h^{-1}M_\odot$. The relations depend very little
on whether the galaxy is a central or satellite system. The
dispersion around the relations is also similar for the two
types (in Fig. 2 the solid and dashed lines for central and
satellite galaxies lie almost on top of each other). The similarity
in the $M_{\rm infall}$-$M_{\rm stars}$ relations between satellites and
centrals may be regarded as something of a coincidence. Although
there is little change in the stellar/baryonic component
of the galaxy after it falls into a larger structure, haloes
of the same mass at different times have different circular
velocities and hence different cooling and star formation
efficiencies. As we will show later, we obtain better fits
to the observational data if we allow the relations for
central and satellite galaxies to differ.

Fig. 3 shows the relation between luminosity and
$M_{\rm infall}$. It can also be fit by a double power law, but the difference
between central and satellite galaxies is much more
obvious. At a given value of $M_{\rm infall}$, central galaxies are more
luminous than satellites because they are forming stars at
higher rates and their stellar mass-to-light ratios are lower.
The difference between central and satellite galaxies becomes
very small at large values of $M_{\rm infall}$. This can be
understood as a simple consequence of hierarchical structure
formation: massive haloes were formed more recently
than less massive haloes and subhaloes with large masses
are likely to have been accreted relatively recently. Massive
satellite galaxies have therefore not been satellites for long
and thus have mass-to-light ratios that are more similar to
their central counterparts. In addition, the Croton et al. models
include a "radio AGN mode" of feedback, which acts to
suppress cooling onto the most massive galaxies. This also
acts to reduce the difference between central and satellite
galaxy colours and mass-to-light ratios.



4 PARAMETRIZATIONS AND TESTS 
4.1 Functional form 

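We use a two-power-law model of the following form to fit
the median value of the relations between $M_{\rm stars}$, $M_{\rm baryon}$,
$L$ and $M_{\rm infall}$:

$$x = \frac{2}{\left(\frac{M_{\rm infall}}{M_0}\right)^{-\alpha} + \left(\frac{M_{\rm infall}}{M_0}\right)^{-\beta}} \times k ,$$

where $x$ denotes $M_{\rm stars}$, $M_{\rm baryon}$ or $L$, and the relation between
luminosity $L$ and $b_J$-band magnitude is given by:

$$M_{b_J} - 5\log h = -2.5 \log L .$$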

We fit these relations for central and satellite galaxies separately,
as well as for the galaxy population as a whole. We
will later test whether separate fits to the central galaxies
and satellites make a significant difference to our results. We
also assume that the dispersion around the median value has
a lognormal form.
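
A minimal sketch of how such a parametrized relation can be evaluated, assuming the double power law takes the form $x = 2k / [(M_{\rm infall}/M_0)^{-\alpha} + (M_{\rm infall}/M_0)^{-\beta}]$ and that the lognormal scatter is drawn independently for each galaxy. The function names are ours; the parameter values are the $M_{\rm stars}$ "total" row of Table 1.

```python
import math
import random


def median_relation(m_infall, m0, alpha, beta, k):
    """Median galaxy property at a given M_infall (assumed double power law)."""
    ratio = m_infall / m0
    return 2.0 * k / (ratio ** (-alpha) + ratio ** (-beta))


def sample_property(m_infall, m0, alpha, beta, k, sigma_dex, rng):
    """Draw one value with lognormal scatter of sigma_dex around the median."""
    log_median = math.log10(median_relation(m_infall, m0, alpha, beta, k))
    return 10.0 ** rng.gauss(log_median, sigma_dex)


# M_stars, whole galaxy sample (Table 1); masses in h^-1 M_sun.
M0, ALPHA, BETA, K, SIGMA = 3.16e11, 0.39, 1.92, 10 ** 10.35, 0.156

rng = random.Random(42)  # fixed seed so the illustration is repeatable
for m in (1e11, M0, 1e13):
    med = median_relation(m, M0, ALPHA, BETA, K)
    draw = sample_property(m, M0, ALPHA, BETA, K, SIGMA, rng)
    print(f"M_infall = {m:.2e}: median = {med:.2e}, one draw = {draw:.2e}")
```

With this form the median equals $k$ at $M_{\rm infall} = M_0$, and the logarithmic slope tends to $\alpha$ at high masses and to $\beta$ at low masses.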

Table 1 lists the parameters of the best-fitting models
for the relations between $M_{\rm infall}$ and $M_{\rm stars}$, $M_{\rm baryon}$ and
$L$. The models have five parameters.



(i) $M_0$ is the critical mass/luminosity at which the slope
of the relation changes. When we fit satellite and central
galaxies separately, we find almost exactly the same values
for this parameter (even for luminosity, the difference is less
than 20%). We therefore fix $M_0$ at the best-fit value for the
galaxy sample as a whole.

(ii) $\alpha$ and $\beta$ describe the slope of the relations at high
and low values of $M_{\rm infall}$.

(iii) $k$ is a normalization constant.

(iv) We have calculated the interval in $\log M_{\rm stars}$,
$\log M_{\rm baryon}$ and $\log L$ that encloses the central 68% of the
probability distribution for 8 different values of $\log M_{\rm infall}$
from $10^{10.5}\,h^{-1}M_\odot$ to $10^{14}\,h^{-1}M_\odot$, with step 0.5 dex. We
Table 1. Best-fit parameter values for the relations between $M_{\rm infall}$ and $M_{\rm stars}$, $M_{\rm baryon}$ and $L$ as derived from the semi-analytic
galaxy catalogues of Croton et al. (2006).

                            $M_0$ ($h^{-1}M_\odot$)   $\alpha$   $\beta$   $\log(k)$   $\sigma$   $\chi^2$

$M_{\rm stars}$    total       $3.16\times10^{11}$     0.39      1.92      10.35      0.156     0.0146
                   central     $3.16\times10^{11}$     0.39      1.96      10.35      0.148     0.0240
                   satellite   $3.16\times10^{11}$     0.39      1.83      10.34      0.167     0.0057

$M_{\rm baryon}$   total       $3.61\times10^{11}$     0.36      1.59      10.44      0.147     0.0415
                   central     $3.61\times10^{11}$     0.35      1.59      10.46      0.133     0.0542
                   satellite   $3.61\times10^{11}$     0.37      1.59      10.40      0.162     0.0273

$L(M_{b_J})$       total       $1.49\times10^{11}$     0.36      1.90      7.14       0.215     0.0360
                   central     $1.49\times10^{11}$     0.31      1.99      7.25       0.169     0.1250
                   satellite   $1.49\times10^{11}$     0.46      1.81      6.90       0.189     0.0359


then calculate the average of these 8 values; the value of
$\sigma$ quoted in Table 1 is 0.5 times this number.
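
The scatter estimate in item (iv) can be sketched as follows. The percentile convention (nearest-rank on the sorted values) and the synthetic Gaussian test data are our assumptions; the text only specifies the central 68% interval, the 8 bins and the factor of 0.5.

```python
import random


def central_interval_width(values, fraction=0.68):
    """Width of the interval enclosing the central `fraction` of the values."""
    ordered = sorted(values)
    n = len(ordered)
    lo_idx = int(round((0.5 - fraction / 2.0) * (n - 1)))
    hi_idx = int(round((0.5 + fraction / 2.0) * (n - 1)))
    return ordered[hi_idx] - ordered[lo_idx]


def sigma_from_bins(log_values_per_bin):
    """0.5 times the mean 68% interval width over the M_infall bins."""
    widths = [central_interval_width(v) for v in log_values_per_bin]
    return 0.5 * sum(widths) / len(widths)


# Synthetic check: 8 bins of log-values with a true scatter of 0.15 dex.
rng = random.Random(0)
demo_bins = [[rng.gauss(10.0 + 0.3 * i, 0.15) for _ in range(20000)]
             for i in range(8)]
sigma_demo = sigma_from_bins(demo_bins)
print(f"recovered sigma ~ {sigma_demo:.3f} dex")
```

For a Gaussian distribution in the logarithm, half the 68% interval is very nearly the standard deviation, which is why this estimate can be used directly as the lognormal width.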

The resulting model fits are plotted as dashed and dotted
lines in Fig. 1 and Fig. 3. The quality of the fit is given
in the last column of Table 1 and is calculated as:

$$\chi^2 = \sum \left(\frac{x_{\rm fit} - x_{\rm SAM}}{x_{\rm SAM}}\right)^2 ,$$

where $x$ represents $M_{\rm stars}$, $M_{\rm baryon}$ or $L$ for each relation,
and the sum is over the 8 mass bins with
$10^{10.5}\,h^{-1}M_\odot < M_{\rm infall} < 10^{14.0}\,h^{-1}M_\odot$.
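
The goodness-of-fit measure can be written in a few lines, assuming it is the sum over bins of the squared fractional difference between the fitted and the semi-analytic values. The function name and the toy numbers are illustrative only.

```python
def fit_quality(x_fit, x_sam):
    """Sum over bins of ((x_fit - x_SAM) / x_SAM)^2."""
    if len(x_fit) != len(x_sam):
        raise ValueError("x_fit and x_sam must have the same number of bins")
    return sum(((f - s) / s) ** 2 for f, s in zip(x_fit, x_sam))


# Toy example with two bins: a 10 per cent offset in the first bin only.
chi2_demo = fit_quality([1.1, 2.0], [1.0, 2.0])
print(chi2_demo)
```

Because each term is a fractional deviation, the statistic does not depend on the units of $x$, but it does depend on whether $x$ is taken in linear or logarithmic form, which the sketch leaves to the caller.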

4.2 Tests 

The next step is to see whether these parametrized relations 
allow us to recover the basic statistical properties of the sim- 
ulated galaxy catalogue, such as the mass/luminosity func- 
tion and the mass/luminosity dependence of the two point 
correlation function. When fitting to the quantities $M_{\rm stars}$
and $M_{\rm baryon}$, we do not distinguish between central and
satellite galaxies because the relations are almost the same
for both. When fitting to galaxy luminosity, we do allow $\alpha$,
$\beta$, $k$ and $\sigma$ to vary between central and satellite galaxies, but
$M_0$ remains fixed for both. Note that the positions and the
velocities of the galaxies are exactly the same as specified
in the semi-analytic galaxy catalogues; the parametrized relations
between galaxy mass/luminosity and $M_{\rm infall}$ simply
provide us with an alternative way of specifying the properties
of the galaxies. 

Fig. 4 and Fig. 5 show the results of our test. Symbols
show results calculated directly from the semi-analytic
galaxy catalogues and lines are from our parametrized
model. The stellar mass function is well reproduced, and
we can also recover the correlation function for different stellar mass
bins: $10^{9}\,h^{-1}M_\odot$, $10^{10.5}\,h^{-1}M_\odot$ and $10^{11.5}\,h^{-1}M_\odot$ (Fig. 4).
For luminosity, the parametrized model is not quite as successful.
Although the luminosity function is well reproduced,
there are some discrepancies in the dependence of the
clustering amplitude on luminosity (solid blue curve in
Fig. 5). Part of the reason for this discrepancy is that our
parametrization of the $L$-$M_{\rm infall}$ relation has a somewhat
larger $\chi^2$ than the $M_{\rm stars}$-$M_{\rm infall}$ relation (see Table 1).
In addition, in order to reproduce the clustering trends as a
function of luminosity, it is critical to fit the relation for central
and for satellite galaxies separately. If we apply a single



relation for both kinds of galaxies, we obtain the red dashed
curve in Fig. 5, which is even more discrepant. Our results
suggest that in order to reproduce the clustering dependence
on luminosity in a more exact way, one would need to introduce
an additional dependence of the $L$-$M_{\rm infall}$ relation on
the parameter $t_{\rm infall}$, the time when the galaxy was last the
central object of its own halo. This does not appear to be
necessary in order to reproduce the stellar mass dependence
of galaxy clustering. The reason for this difference is that
the optical light from galaxies, unlike their stellar mass, is
heavily influenced by the contribution from the youngest
stars, which have lifetimes that are short compared to the
age of the Universe. Once a galaxy becomes a satellite, it
will fade in luminosity even though its stellar mass remains
approximately constant. For the sake of simplicity, we will
not introduce $t_{\rm infall}$ as an additional parameter in this paper,
but we will come back to this in future work in which
we consider the colour-dependence of galaxy clustering.

4.3 The effect of "Orphan" Galaxies 

The majority of HOD models in the literature only consider 
dark matter haloes and subhaloes that can be identified at 
the present time. Satellite galaxies without surrounding sub- 
haloes are thus omitted from the analysis. We now explore 
the effect of these 'orphan' satellite galaxies on our results. 

The left panel of Fig. 6 compares the $L$-$M_{\rm infall}$ relation
for orphan satellites with the results obtained for central
galaxies and satellite galaxies that have retained their
subhaloes. The right panel of Fig. 6 shows the relative contribution
of central galaxies, satellite galaxies with subhaloes
and orphan satellites without subhaloes to the luminosity
function of the galaxies in the semi-analytic catalogue. As
can be seen, orphan satellites have lower luminosities at a
given value of $M_{\rm infall}$ than either central galaxies or satellite
galaxies with subhaloes, i.e. orphan galaxies are the oldest
galaxies with the highest mass-to-light ratios in the simulation.
In addition, we see that the contribution of these orphan
satellites is highest at the faintest luminosities. Fig. 7
explores the contribution of the orphan satellites to the correlation
function in three different bins of absolute magnitude.
tude. The solid curves show the result for all the galaxies 
while the dashed red curves show the result when the or- 
phan satellites are omitted. As can be seen, the orphaned 
satellites contribute heavily to the correlation function of 
Figure 4. Results from the comparison of the parametrized model and the semi-analytic galaxy catalogue. The left panel shows the
stellar mass function. The right panel shows correlation functions for three different stellar mass bins: $10^{9}\,h^{-1}M_\odot$, $10^{10.5}\,h^{-1}M_\odot$ and
$10^{11.5}\,h^{-1}M_\odot$. Symbols are for the semi-analytic galaxy catalogue. Solid lines are for our parametrized models.

Figure 5. Same as Fig. 4, except for the luminosity function (left) and the correlation function as a function of absolute magnitude
(right). Symbols are the semi-analytic results; error bars in the right panel are the bootstrap errors of the correlation length for the
semi-analytic model. Solid lines are for parametrized models where relations for central galaxies and satellite galaxies are fit separately. The
dashed lines are for models where the fit is for the galaxy population as a whole.



faint galaxies on scales of less than 1 Mpc. Omission of these
systems causes the amplitude of the correlation function to
be underestimated by more than an order of magnitude at
separations of 0.1 Mpc for galaxies with $-18 < M_{b_J} < -17$.

We note that there are uncertainties in our treat- 
ment of orphan galaxies in the simulation. Some of these 
galaxies may indeed be destroyed or significantly reduced 
in mass by tidal stripping effects. Indeed, the existence 
of a significant intra-cluster light component does suggest
that tidal effects or mergers do unbind some of the stars
in satellite galaxies (Arnaboldi 2004; Feldmeier et al. 2004;
Zibetti et al. 2005). In the face of these uncertainties, we have



chosen to assume that the visible galaxies survive even after 
their subhalo falls below the resolution limit of the simu- 
lation. It is possible that we over-estimate the number of 
these objects because we do not include tidal stripping on 
the stellar component. However, we believe that "orphan" 
galaxies (at least part of them) are needed in order to ex- 
plain observational results. From Fig. 7 we see that when "or- 
phan" galaxies are excluded, the correlation signal decreases 
at small scales, at odds with observational results (see later 
in Fig. 9 and Fig. 10). In this work we consider all the 'or- 
phan' systems as part of satellite subsamples. 
Figure 6. $L$-$M_{\rm infall}$ relations (left) and luminosity functions (right) for different types of galaxies from the semi-analytic galaxy
catalogue: central galaxies (solid lines), satellite galaxies with subhaloes (dashed lines), satellite galaxies without subhaloes (dotted
lines). The dashed-dotted line in the right-hand panel shows the total luminosity function for all galaxies.

Figure 7. Correlation functions for three luminosity bins including (solid lines) and not including (dashed lines) orphan satellite galaxies.



4.4 Changes in the input parameters 

One advantage of our parametrized approach is that we can 
understand the effect of changing each different parameter 
and thus gain intuition about what changes are necessary 
to bring the models into the closest possible agreement with 
the observational data. This is different in spirit to exploring 
parameter space in the semi-analytic models, because the 
parameters in these models are tied to the physical recipes 
for star formation and feedback rather than the relation be- 
tween halo mass and galaxy properties, which is the focus 
of our approach. 

In the upper panel of Fig. 8 we show how changing
each of the parameters affects the stellar mass function.
Note that the normalization constant $k$ is always adjusted
in order to keep the amplitude of the mass function at
$M_{\rm stars} = 10^{11}\,M_\odot$ fixed. Changing $M_0$ affects the mass scale
of the transition between the two power laws as well as the
amplitude of the mass function at both low and high
masses. Changes to $\alpha$ affect the shape of the mass function
at the high mass end, while changes to $\beta$ affect the low mass
end of the mass function. A change in the scatter $\sigma$ has a similar
effect to a change in $\alpha$, and influences the amplitude of
the mass function at the high mass end. This is because the
mass function is relatively flat at low masses and declines
steeply at high masses, so an increasing amount of scatter
in the $M_{\rm stars}$-$M_{\rm infall}$ relation will have a strong effect on
the number of high mass galaxies.



The lower panels in Fig. 8 show the effect of the
same parameter changes on the amplitude of the correlation
function evaluated on scales of $r = 0.33\,h^{-1}$ Mpc and
$r = 5.30\,h^{-1}$ Mpc. We see that a parameter change that
causes an increase in the number of galaxies of given mass, 
will cause a corresponding decrease in the clustering ampli- 
tude of these systems. This is easy to understand. In order 
to have more galaxies of a given mass in the simulation, they 
must be shifted into lower mass haloes and these low mass 
haloes are more weakly clustered. 
Figure 8. The effect of changing parameters on the stellar mass function (upper panels) and the correlation function at scales of $r = 0.33\,h^{-1}$ Mpc
and $r = 5.30\,h^{-1}$ Mpc (lower panels). The solid lines represent the best fit model for the $M_{\rm stars}$-$M_{\rm infall}$ relation.



5 APPLICATION TO SDSS 

In this section, we apply our models to observational data
from the Sloan Digital Sky Survey. Recent large scale
redshift surveys such as the 2dFGRS (Colless et al. 2001)
and the Sloan Digital Sky Survey (SDSS; York et al. 2000)
provide galaxy samples that are large enough to measure
the luminosity dependence of galaxy clustering accurately
(Norberg et al. 2002a,b; Zehavi et al. 2005). In this paper,
we make use of the recent measurements of the projected
correlation function $w(r_p)$ by Li et al. (2006). These
authors calculated $w(r_p)$ not only as a function of galaxy
luminosity, but also of stellar mass, using a sample of
galaxies constructed from the SDSS Data Release 2 (DR2)
data. The methods for estimating the stellar masses are
described in Kauffmann et al. (2003). Here we make use of
these measurements to constrain the relation between
galaxy luminosity, stellar mass and $M_{\rm infall}$. To take
account of the effect of "cosmic variance" on the
observational results, we have constructed a set of 16
mock galaxy catalogues from the simulation with exactly
the same geometry and selection function as in the
observational sample. The effect of cosmic variance is
modelled by placing a virtual observer randomly inside the
simulation box when constructing these mock catalogues.
For each mock catalogue, we measure $w(r_p)$ for galaxies in
the same intervals of luminosity/stellar mass as in the
observations. The $1\sigma$ variation between these mock
catalogues is then added in quadrature, as an additional
error, to the bootstrap errors given by Li et al. (2006).
The cosmic variance errors become significant for the low
luminosity and low mass subsamples, particularly at large
values of $r_p$. The detailed procedure for constructing
these mock catalogues will be presented in a separate
paper (Li et al., in preparation).
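
The error budget described above can be sketched in a few lines. This is only a minimal illustration, assuming the mock measurements are stored as one list of $w(r_p)$ values per catalogue. The function name `combine_errors` and the use of the sample standard deviation as the $1\sigma$ scatter are assumptions made here, not details given in the text.

```python
import math
import statistics


def combine_errors(mock_wp, bootstrap_err):
    """Add the mock-to-mock scatter in quadrature to the bootstrap errors.

    mock_wp       : list of lists, one w(r_p) curve per mock catalogue
    bootstrap_err : list of bootstrap errors, one per r_p bin
    """
    total = []
    for i, sigma_boot in enumerate(bootstrap_err):
        # 1-sigma scatter between the mocks in this r_p bin
        # (sample standard deviation; an assumption of this sketch)
        sigma_cv = statistics.stdev([curve[i] for curve in mock_wp])
        total.append(math.sqrt(sigma_boot ** 2 + sigma_cv ** 2))
    return total
```

With 16 mock catalogues, `mock_wp` would hold 16 curves for each luminosity or stellar mass bin, and the returned list would replace the bootstrap errors in the fit.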



To compare our models with the observations, we need
either to convert $w(r_p)$ to the real space correlation
function $\xi(r)$, or to calculate $w(r_p)$ from our model
galaxy catalogue directly. We tested the method presented
by Hawkins et al. (2003) for converting $w(r_p)$ to $\xi(r)$ on
scales less than around $30\,h^{-1}$ Mpc. We find that the
conversion amplifies the errors, and the results for the low
luminosity and low mass bins are then too noisy to provide
good constraints on our models. Therefore, we derive
$w(r_p)$ from our catalogue by integrating the real space
correlation function $\xi(r)$:



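$$
w(r_p) = 2\int_0^{\infty} \xi\left(\sqrt{r_p^2 + r_\parallel^2}\right) dr_\parallel
       = 2\int_{r_p}^{\infty} \xi(r)\,\frac{r\,dr}{\sqrt{r^2 - r_p^2}}
$$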



We truncate the integration at $r = 60\,h^{-1}$ Mpc and the
resulting $w(r_p)$ is reliable up to a scale of ~$10\,h^{-1}$ Mpc.
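
To illustrate the effect of the truncation, the sketch below evaluates the line-of-sight integral numerically for an assumed power law $\xi(r) = (r/r_0)^{-\gamma}$, for which the untruncated projection is known analytically. The power-law form, the values $r_0 = 5\,h^{-1}$ Mpc and $\gamma = 1.8$, and the function names are illustrative assumptions. They are not the correlation function measured from the model catalogue.

```python
import math


def xi_power_law(r, r0=5.0, gamma=1.8):
    """Illustrative real-space correlation function (assumed form)."""
    return (r / r0) ** (-gamma)


def wp_truncated(rp, xi, r_max=60.0, n_steps=20000):
    """w(r_p) = 2 * integral of xi(sqrt(rp^2 + los^2)) d(los), cut at r = r_max.

    Uses the midpoint rule along the line-of-sight separation `los`.
    """
    los_max = math.sqrt(r_max ** 2 - rp ** 2)
    step = los_max / n_steps
    total = 0.0
    for i in range(n_steps):
        los = (i + 0.5) * step
        total += xi(math.sqrt(rp ** 2 + los ** 2))
    return 2.0 * total * step


def wp_power_law_exact(rp, r0=5.0, gamma=1.8):
    """Analytic projection of the power law, integrated to infinity."""
    return (rp * (r0 / rp) ** gamma
            * math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
            / math.gamma(gamma / 2.0))


for rp in (1.0, 10.0):
    ratio = wp_truncated(rp, xi_power_law) / wp_power_law_exact(rp)
    print(f"r_p = {rp:5.1f}  truncated / untruncated = {ratio:.3f}")
```

For this toy power law the fractional deficit caused by the cut at $60\,h^{-1}$ Mpc grows with $r_p$, which is the qualitative reason the projected function is only trusted out to a limited scale.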

We now generate a grid of models by systematically
varying the 5 parameters listed in Table 1. We compare each
model with the galaxy luminosity function (Blanton et al.
2003b) and the $w(r_p)$ measurements in different ranges in
luminosity. We define the best fitting model to be the one
giving the minimum $\chi^2$, defined as follows:



$$
\chi^2 = \frac{\chi^2(\Phi)}{N_\Phi} + \frac{\chi^2_{\rm corr}}{N_{\rm corr}}
$$

with

$$
\chi^2(\Phi) = \sum \left[ \frac{\Phi - \Phi_{\rm SDSS}}{\sigma(\Phi_{\rm SDSS})} \right]^2
$$

and

$$
\chi^2_{\rm corr} = \sum \left[ \frac{w(r_p) - w(r_p)_{\rm SDSS}}{\sigma(w(r_p)_{\rm SDSS})} \right]^2
$$

Figure 10. Best fit model to the stellar mass function and the correlation function evaluated in different stellar mass bins using data
from the SDSS. Symbols with error bars are the SDSS results, and dashed red lines are from our parametrized model. Dotted blue
lines show the results obtained when central and satellite galaxies are treated separately. Green dashed/dotted lines are results for
central/non-central subsamples of the parametrized model when central and satellite galaxies are treated separately.



$N_\Phi$ is the number of points over which the luminosity
function is measured ($N_\Phi = 102$ for the r-band absolute
magnitude ranging from $-18$ to $-23$). $N_{\rm corr}$ is the
number of points over which the correlation function is
measured ($N_{\rm corr} = 93$, ranging from 0.11 to
$8.97\,h^{-1}$ Mpc for the luminosity bins [-19, -18],
[-20, -19], [-21, -20] and [-22, -21], and from 0.57 to
$8.97\,h^{-1}$ Mpc for the most luminous bin [-23, -22]
in the r-band).
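
The fitting statistic, $\chi^2 = \chi^2(\Phi)/N_\Phi + \chi^2_{\rm corr}/N_{\rm corr}$, can be sketched as below. The function names and the flat-list layout of the inputs (for example, the $w(r_p)$ points of all luminosity bins concatenated into one list) are assumptions made for illustration only.

```python
def chi2_term(model, data, sigma):
    """Sum of squared, error-normalised residuals."""
    return sum(((m - d) / s) ** 2 for m, d, s in zip(model, data, sigma))


def combined_chi2(phi_model, phi_obs, phi_err, wp_model, wp_obs, wp_err):
    """chi^2 = chi^2(Phi) / N_Phi + chi^2_corr / N_corr."""
    chi2_phi = chi2_term(phi_model, phi_obs, phi_err)
    chi2_corr = chi2_term(wp_model, wp_obs, wp_err)
    # Each term is divided by its own number of data points.
    return chi2_phi / len(phi_obs) + chi2_corr / len(wp_obs)
```

Because each term is normalised by its own number of points, the luminosity function and the correlation functions enter the total with comparable weight even when the numbers of points differ.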

To compare with the SDSS observations, where the median
galaxy redshift is around 0.1, we correct the r-band
absolute magnitude $M_r$ of each model galaxy to its
$z = 0.1$ value $M_{^{0.1}r}$ using the K-correction code
(kcorrect v3_1b) of Blanton et al. (2003a) and the
luminosity evolution model of Blanton et al. (2003b). To
calculate the K- and E-correction, each galaxy is assigned
a redshift by placing a virtual observer at the centre of
the simulation box. The redshift as "seen" by the observer
is thus determined by the comoving distance to the observer
and the peculiar velocity of the galaxy. The corrected
r-band magnitude is given by:

$$
M_{^{0.1}r} = -2.5\log L + K_{\rm correction} + E_{\rm correction} - 5\log h
$$
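
As a hedged sketch of this bookkeeping, the snippet below combines a cosmological redshift with a line-of-sight peculiar velocity and then applies the magnitude relation as written. It assumes that the cosmological redshift has already been obtained from the comoving distance, that the peculiar velocity is non-relativistic, that $L$ is in units that absorb the magnitude zero-point, and that the K- and E-corrections come from an external code. All function and argument names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def observed_redshift(z_cosmological, v_los_km_s):
    """Redshift seen by the virtual observer.

    Combines the cosmological redshift (from the comoving distance) with the
    Doppler shift of the line-of-sight peculiar velocity.
    """
    return (1.0 + z_cosmological) * (1.0 + v_los_km_s / C_KM_S) - 1.0


def corrected_magnitude(luminosity, k_corr, e_corr, h=1.0):
    """M = -2.5 log10(L) + K + E - 5 log10(h), as written in the text."""
    return (-2.5 * math.log10(luminosity) + k_corr + e_corr
            - 5.0 * math.log10(h))
```

In practice `k_corr` and `e_corr` would be evaluated at the value returned by `observed_redshift` for each galaxy.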

Our best fit model has the parameters
$M_0 = 3.41 \times 10^{11}\,h^{-1}M_\odot$, $\alpha = 0.221$,
$\beta = 1.67$, $\log k = 8.13$ and $\sigma = 0.440$ for the
central galaxies, and $M_0 = 2.58 \times 10^{11}\,h^{-1}M_\odot$,
$\alpha = 0.345$, $\beta = 3.83$, $\log k = 7.71$ and
$\sigma = 0.742$ for the satellite galaxies (see Table 2).
The resulting luminosity function and correlation functions
are shown in Fig. 5. $\chi^2(\Phi)/N_\Phi$ is 3.348 and the
total $\chi^2$ is 6.115. Also plotted are the results of
central and satellite subsamples of our parametrized model,

Figure 11. Best fit $L$-$M_{\rm infall}$ and $M_{\rm stars}$-$M_{\rm infall}$ relations as constrained by the SDSS data. Blue circles are the central galaxies
and red triangles are satellites. Green lines are the best fitting relations from the semi-analytic catalogue of Croton et al. (2006). Filled
circles show the central halo mass from the galaxy-galaxy lensing results of Mandelbaum et al. (2006); error bars are the 95% confidence
limits. The results shown are for the combined sample of early and late-type galaxies (Mandelbaum, private communication).



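Table 2. Best-fit parameter values for the relations between $M_{\rm infall}$ and $M_{\rm stars}$ and $L(M_r)$ as derived from the SDSS data. Where a single $\chi^2$ value is listed for a central/satellite pair, it refers to the joint fit.

| | | $M_0$ ($h^{-1}M_\odot$) | $\alpha$ | $\beta$ | $\log(k)$ | $\sigma$ | $\chi^2$ | $\chi^2(\Phi)/N_\Phi$ |
|---|---|---|---|---|---|---|---|---|
| $M_{\rm stars}$ ($M_\odot$) | total | $3.15 \times 10^{11}$ | 0.118 | 2.87 | 10.26 | 0.326 | 16.96 | 2.487 |
| | central | $3.33 \times 10^{11}$ | 0.276 | 2.59 | 10.27 | 0.241 | 5.351 | 1.850 |
| | satellite | $4.64 \times 10^{11}$ | 0.122 | 2.48 | 10.26 | 0.334 | | |
| $L(M_r)$ | central | $3.41 \times 10^{11}$ | 0.221 | 1.67 | 8.13 | 0.440 | 6.115 | 3.348 |
| | satellite | $2.58 \times 10^{11}$ | 0.345 | 3.83 | 7.71 | 0.742 | | |
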

shown by green dashed and dotted lines. The drop in the
correlation function on scales larger than ~$10\,h^{-1}$ Mpc
is not caused by a poor fit; it is due to the truncation of
our integration of the real space correlation function at
$r = 60\,h^{-1}$ Mpc.

We now carry out the same analysis for stellar mass,
rather than luminosity. We have constructed the stellar
mass function directly from the SDSS DR2 data (Fig. 10,
left panel) and use this, in conjunction with the
measurements of $w(r_p)$ as a function of stellar mass
published by Li et al. (2006), to constrain the
$M_{\rm stars}$-$M_{\rm infall}$ relation. In the computation
of the stellar mass function, we have corrected for the
volume effect by weighting each galaxy by a factor of
$V_{\rm survey}/V_{\rm max}$, where $V_{\rm survey}$ is the
volume of the sample and $V_{\rm max}$ is the maximum volume
over which the galaxy could be observed within the sample
redshift range ($0.01 < z < 0.3$) and within the range of
r-band apparent magnitude ($14.5 < r < 17.77$). A Schechter
function provides a good fit to our measurement at stellar
masses $M_{\rm stars} < 10^{11.5}\,h^{-2}M_\odot$. We find
best-fit parameters
$\Phi^* = (0.0204 \pm 0.0001)\,h^3\,{\rm Mpc}^{-3}$,
$\alpha = -1.073 \pm 0.003$ and
$M^*_{\rm stars} = (4.11 \pm 0.02) \times 10^{10}\,h^{-2}M_\odot$.
This corresponds to a stellar mass density of
$(8.779 \pm 0.067) \times 10^8\,h\,M_\odot\,{\rm Mpc}^{-3}$.
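
The quoted density can be checked from the fit parameters. The sketch below assumes the standard Schechter form, $\Phi(M)\,dM = \Phi^* (M/M^*)^{\alpha} e^{-M/M^*}\,dM/M^*$, which the text does not spell out. Its first moment over all masses is $\rho = \Phi^* M^* \Gamma(\alpha + 2)$. The function name is illustrative.

```python
import math


def schechter_mass_density(phi_star, m_star, alpha):
    """First moment of a Schechter function over all masses.

    rho = phi_star * m_star * Gamma(alpha + 2)
    """
    return phi_star * m_star * math.gamma(alpha + 2.0)


# Best-fit values quoted in the text:
# phi_star in h^3 Mpc^-3, m_star in h^-2 M_sun, so rho is in h M_sun Mpc^-3.
rho = schechter_mass_density(phi_star=0.0204, m_star=4.11e10, alpha=-1.073)
print(f"{rho:.3e}")
```

With the central values of the fit, this agrees with the quoted stellar mass density to within its stated uncertainty.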

We fit our models to 30 points along the stellar mass
function and 20 points along the correlation function for
five different stellar mass bins ranging from $10^9$ to
$10^{12} M_\odot$. The parameters of the best-fit models are
listed in Table 2. For the stellar mass function, the
errors due to sample size are much smaller than the
systematic errors in the stellar mass estimates themselves.
We therefore assign the same error to all points at stellar
masses less than $10^{11.5}\,h^{-2}M_\odot$ (the error is
equal to the value at that mass). In our first attempt at
fitting the data, we assumed that the
$M_{\rm stars}$-$M_{\rm infall}$ relation would be the same
for central and satellite galaxies, because the relations
are very similar in the semi-analytic galaxy catalogues.
The red dashed lines in Fig. 10 show the best fitting
results. The model clearly over-predicts the clustering of
the more massive galaxies on small scales. If we allow the
relation between $M_{\rm stars}$ and $M_{\rm infall}$ to
differ for central and satellite galaxies, we obtain the
results shown by the blue dotted lines, which are
considerably better.

The best-fit r-band luminosity-$M_{\rm infall}$ and
$M_{\rm stars}$-$M_{\rm infall}$ relations derived from our
models are illustrated in Fig. 11. Results are shown
separately for central galaxies (blue) and satellite
galaxies (red). In our models, satellite galaxies have
lower luminosities and smaller stellar masses than central
galaxies at a given value of $M_{\rm infall}$. This effect
is larger for luminosity than for stellar mass,
particularly at low values of $M_{\rm infall}$.

For comparison, we also plot the $L$-$M_{\rm infall}$ and
$M_{\rm stars}$-$M_{\rm infall}$ relations from the
semi-analytic galaxy catalogue (Croton et al. 2006). We
transform the $b_J$ band magnitudes of the semi-analytic
catalogue to the r band in SDSS according to the luminosity
functions of 2dFGRS (Madgwick et al. 2002) and SDSS
(Blanton et al. 2003b), and make a shift of 0.9 dex to do
the comparison. Because Croton et al. assumed a Salpeter
initial mass function, the stellar mass-to-light ratios of
the galaxies in their catalogue will be a factor of ~2
higher than in the SDSS data sample. This is because the
stellar masses of Kauffmann et al. (2003) have been derived
using a Kroupa (2001) IMF. As discussed by Kauffmann et al.
(2003), the Salpeter IMF yields stellar masses for
elliptical galaxies that exceed estimates of their
dynamical masses (Cappellari et al. 2006), which disfavours
the Salpeter IMF for these systems. For the comparison
shown in Fig. 11 we have simply scaled the stellar masses
in the Croton et al. catalogues by multiplying by a factor
0.5, which should give almost the same results as
re-running the semi-analytic model with the Kroupa IMF.
Compared with our results, the semi-analytic catalogue
yields systematically higher luminosities and stellar
masses at low values of $M_{\rm infall}$, particularly for
satellite galaxies. The agreement with the semi-analytic
catalogue at $M_{\rm infall} > 10^{11.6}\,h^{-1}M_\odot$ is
quite good.



Recently, Mandelbaum et al. (2005) and Mandelbaum et al.
(2006) have used galaxy-galaxy weak lensing measurements
from SDSS data to explore the connection between galaxies
and dark matter. They compare the measured lensing signal
with that predicted by a halo model constructed using a
dissipationless simulation, and extract median/mean halo
masses and satellite fractions for galaxies as a function
of luminosity, stellar mass and morphology. We plot their
estimates of the mean central halo mass as a function of
r-band absolute magnitude and stellar mass as filled
circles in Fig. 11. The results shown are for the combined
sample of early and late-type galaxies (Mandelbaum, private
communication). These measurements should be compared with
our blue points, which show the mean halo masses of
present-day central galaxies. As can be seen, there is
remarkably good agreement between the two methods, both
for luminosity and for stellar mass.



6 CONCLUSIONS AND DISCUSSION 

We have constructed a new statistical model of galaxy
clustering for use in high resolution numerical simulations
of structure formation. Unlike classic halo occupation
distribution (HOD) models, galaxy positions and velocities
are determined in a self-consistent way by following the
full orbital and merging histories of all the haloes and
subhaloes in the simulation. We believe that this
methodology has advantages over the traditional approach.
Most HOD models assume that the galaxy content of a halo of
given mass is statistically independent of its larger scale
environment. Recently, Gao et al. (2005) have shown that
there exists an age dependence of halo clustering: haloes
that formed earlier are more clustered than haloes that
assembled more recently, indicating that this assumption
may not be as safe as previously thought. Since the
positions and the velocities of the galaxies in our model
are determined directly from the simulation, we avoid these
difficulties.

Our methodology also takes into account the contribution
of "orphaned" galaxies, which have lost their haloes due to
tidal stripping. These galaxies contribute significantly to
the clustering amplitude of low mass galaxies on scales
less than ~$1\,h^{-1}$ Mpc. We have chosen to parametrize
the observed properties of galaxies (in particular their
luminosity and stellar mass) as a function of the quantity
$M_{\rm infall}$, the mass of the halo at the epoch when the
galaxy was last the central object in its halo. Using the
semi-analytic model results as a reference, we adopt a
double power law form for this relation, and we show that
this allows us to recover the mass/luminosity function and
the correlation function in different ranges of mass and
luminosity with high accuracy.

We then apply our model to measurements of these
quantities using data from the Sloan Digital Sky Survey. We
find that for a given value of $M_{\rm infall}$, satellite
galaxies are required to be less luminous and less massive
than central galaxies. This effect is stronger at low
values of $M_{\rm infall}$. In the semi-analytic models,
satellite galaxies fade in luminosity after they fall into
a larger halo because they no longer accrete gas and their
star formation rates then decline. The catalogues of Croton
et al. (2006) do show differences between satellite and
central galaxy luminosities at a fixed value of
$M_{\rm infall}$, but the effect is not quite as strong as
the data demand, particularly for low mass haloes. This may
indicate that the efficiency with which baryons are
converted into stars in low mass haloes is higher at the
present day than it was in the past. The fact that the
standard $\Lambda$CDM model predicts more low mass galaxies
than observed is well documented in the literature (Moore
et al. 1999). Many authors have invoked mechanisms for
"suppressing" star formation in these systems (Kauffmann et
al. 1993; Somerville 2002), and most of these mechanisms
operate more effectively at higher redshifts.

Finally, we compare our relations between galaxy
luminosity, stellar mass and host halo mass with similar
relations derived using galaxy-galaxy weak lensing
measurements. The excellent agreement between these two
completely independent methods is very encouraging.